{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.PP-YOLOv2 Introduction\n",
    "\n",
    "As an important algorithm for object detection, the YOLO series adopts the one-stage method to greatly improve the detection speed, but the speed improvement also sacrifices some of the accuracy as a cost. Therefore, how to improve the accuracy of YOLOv3 while maintaining the speed of reasoning has become a key issue in its practical application.PP-YOLOv2 (R50) mAP in the COCO test dataset rises from 45.9% to 49.5%, an increase of 3.6 percentage points compared to v1. FP32 FPS is up to 68.9FPS, FP16 FPS is up to 106.5FPS, surpassing YOLOv4 and even YOLOv5! If RestNet101 is used as the backbone network, PP-YOLOv2 (R101) has up to 50.3% mAP and 15.9% faster than YOLOv5x with the same accuracy!\n",
    "\n",
    "The PP-YOLO model is officially produced by PaddlePaddle and is a model of the YOLOv3 optimized and improved by PaddleDetection. More information about PaddleDetection can be found here https://github.com/PaddlePaddle/PaddleDetection.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Model Effects and Application Scenarios\n",
    "### 2.1 Object Detection Tasks:\n",
    "\n",
    "#### 2.1.1 Datasets:\n",
    "\n",
    "The dataset is mainly in COCO format, which is divided into training set and test set.\n",
    "\n",
    "#### 2.1.2 Model Effects:\n",
    "\n",
    "\n",
    "The detection effect of PP-YOLOv2 on the picture is:\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/23690325/198869600-b7a549db-2cc6-49b1-8009-937fb5abe992.png\"  width = \"80%\"  />\n",
    "</div>\n",
    "\n",
    "<div align=\"center\">\n",
    "<img src=\"https://user-images.githubusercontent.com/23690325/198869611-451eda5f-eda6-4717-902c-b9b06070bc72.png\"  width = \"80%\"  />\n",
    "</div>\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. How to Use the Model\n",
    "\n",
    "### 3.1 Model Inference:\n",
    "* Download \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%mkdir -p ~/work\n",
    "%cd ~/work/\n",
    "!git clone https://github.com/PaddlePaddle/PaddleDetection.git\n",
    "\n",
    "%cd PaddleDetection\n",
    "%mkdir -p demo_input demo_output\n",
    "!pip install -r requirements.txt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Verify whether the installation was successful or not.\n",
    "If an error is reported, only perform the previous step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Whether the installation was successful or not.\n",
    "!python ppdet/modeling/tests/test_architectures.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Quick experience\n",
    "\n",
    "Congratulations! Now that you've successfully installed PaddleDetection, let's get a quick feel at object detection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Predict a picture on the GPU.\n",
    "!export CUDA_VISIBLE_DEVICES=0\n",
    "!wget -P demo_input -N https://paddledet.bj.bcebos.com/modelcenter/images/General/000000014439.jpg\n",
    "!python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o use_gpu=true weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_img=demo_input/000000014439.jpg"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An image with the predicted result is generated under the output folder.\n",
    "\n",
    "The result is as follows:\n",
    "<div align=\"center\">\n",
    "<img src=\"https://bj.bcebos.com/v1/paddledet/modelcenter/images/ppyolov2_infer.jpg\"  width = \"60%\"  />\n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 Model Training\n",
    "* Clone the PaddleDetection repository (see 3.1 for details)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Prepare the datasets."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Change yaml configurations files.\n",
    "\n",
    "\n",
    "\n",
    "Change yaml configurations files``` configs/runtime.yml```\n",
    "\n",
    "```\n",
    "use_gpu: true \n",
    "log_iter: 20  \n",
    "save_dir: output \n",
    "snapshot_epoch: 1 \n",
    "print_flops: false\n",
    "\n",
    "```\n",
    "Change yaml configurations files``` configs/datasets/coco_detection.yml```\n",
    "\n",
    "```\n",
    "metric: COCO    \n",
    "num_classes: 1  \n",
    "\n",
    "TrainDataset:\n",
    "  !COCODataSet\n",
    "    image_dir: WIDER_train/images   \n",
    "    anno_path: WIDERFaceTrainCOCO.json  \n",
    "    dataset_dir: dataset/wider_face \n",
    "    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']\n",
    "\n",
    "EvalDataset:\n",
    "  !COCODataSet\n",
    "    image_dir: WIDER_val/images     \n",
    "    anno_path: WIDERFaceValCOCO.json   \n",
    "    dataset_dir: dataset/wider_face\n",
    "\n",
    "TestDataset:\n",
    "  !ImageFolder\n",
    "    anno_path: WIDERFaceValCOCO.json\n",
    "    \n",
    "```\n",
    "Change yaml configurations files``` configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml```\n",
    "\n",
    "```\n",
    "_BASE_: [\n",
    "  '../datasets/coco_detection.yml',\n",
    "  '../runtime.yml',\n",
    "  './_base_/ppyolov2_r50vd_dcn.yml',\n",
    "  './_base_/optimizer_365e.yml',\n",
    "  './_base_/ppyolov2_reader.yml',\n",
    "]\n",
    "\n",
    "snapshot_epoch: 8   \n",
    "weights: output/ppyolov2_r50vd_dcn_365e_coco/model_final\n",
    "```\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Train the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%cd /home/aistudio/work/PaddleDetection/\n",
    "%env CUDA_VISIBLE_DEVICES=0\n",
    "# Beginning training\n",
    "!python  tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml  --use_vdl=true "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Model evaluation\n",
    "\n",
    "We provide ```configs/ppyolo/ppyolo_test.yml```for evaluating the effect of COCO test-dev2017 dataset, to evaluate the effect of COCO test-dev2017 dataset, you must first download the test-dev2017 dataset from the COCO dataset download page, and extract it to ```configs/ppyolo/ppyolo_test.yml```. The path configured in EvalReader.dataset and evaluated using the following command (attach the average accuracy AP and AR evaluation indicators of the model, and provide pictures or tables)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%cd /home/aistudio/work/PaddleDetection/\n",
    "%env CUDA_VISIBLE_DEVICES=0\n",
    "\n",
    "!python tools/eval.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml  -o use_gpu=true"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Model Principles\n",
    "* Design Detection Net using Path Aggregation Network\n",
    "\n",
    "PP-YOLOv2 uses one of FPN variations, PAN (Path Aggregation Network), to aggregate feature information from top to bottom.\n",
    "\n",
    "![](https://ai-studio-static-online.cdn.bcebos.com/5f047e2e5f3c47efbb81c6cf3d81415e531133c1feff4f36a2cc13f88210ab69)\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Use the Mish activation function\n",
    "\n",
    "PP-YOLOv2's mish activation function is applied to the detection neck instead of the skeleton network.\n",
    "\n",
    "* Larger input size\n",
    "\n",
    "Increasing the input size directly leads to an increase in the target area. This makes it easier for the network to capture information about small-sized targets for higher performance. However, larger inputs result in a larger memory footprint. So while using this strategy, PP-YOLOv2 also reduces the Batch Size."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Attention\n",
    "\n",
    "Whether it is PP-YOLO or PP-YOLOv2, they are looking for the most cost-effective object detection solution in industrial practice, rather than simply stacking networks and strategies to improve the accuracy of single-stage object detection. The paper on PP-YOLOv2 also specifically mentioned that it is to show more network optimization methods for industry developers from the perspective of experimental reports, and these strategies can also be applied to the optimization of other networks, hoping to bring better networks to industry developers and more algorithm optimization inspiration. At the same time, when using the PP-YOLO series, attention should also be paid to:\n",
    "\n",
    "\n",
    "* The PP-YOLO model uses train2017 from the COCO dataset as the training set, val2017 and test-dev2017 as the test set, and Box APtest evaluates the results for mAP (IoU=0.5:0.95).\n",
    "* PP-YOLO model training process uses 8 GPUs, each GPU batch size is 24 for training, if the number of training GPUs and batch size do not use the above configuration, you must refer to the FAQ to adjust the learning rate and number of iterations.\n",
    "* PP-YOLO model inference speed test is tested with single card V100, batch size=1, CUDA 10.2, CUDNN 7.5.1, and TensorRT inference speed test using TensorRT 5.1.2.2.\n",
    "* The inference speed test data of PP-YOLO model FP32 is the inference speed benchmark test result using the Paddle prediction library using the --run_benchnark parameter in the deploy/python/infer .py script after exporting the model using the tools/export_model.py script, and the test is data that does not contain data preprocessing and model output post-processing (NMS) ( Consistent with YOLOv4 (AlexyAB) test method).\n",
    "* Compared to FP32, the speed test of TensorRT FP16 removes the yolo_box (bbox decoding) part of the time-consuming, i.e. does not include data preprocessing, bbox decoding and NMS (consistent with YOLOv4 (AlexyAB) test method).\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Related papers and citations\n",
    "(If this model has relevant papers published, or is based on the results of certain papers, it can be here.)\n",
    "References in Bibtex format are provided. ）\n",
    "\n",
    "```\n",
    "@article{huang2021pp,\n",
    "  title={PP-YOLOv2: A Practical Object Detector},\n",
    "  author={Huang, Xin and Wang, Xinxin and Lv, Wenyu and Bai, Xiaying and Long, Xiang and Deng, Kaipeng and Dang, Qingqing and Han, Shumin and Liu, Qiwen and Hu, Xiaoguang and others},\n",
    "  journal={arXiv preprint arXiv:2104.10419},\n",
    "  year={2021}\n",
    "}\n",
    "```\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.10.6 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  },
  "vscode": {
   "interpreter": {
    "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
